Smart Paper Notes

Stripformer: Strip Transformer for Fast Image Deblurring

Fu-Jen Tsai , Yan-Tsung Peng , Yen-Yu Lin , Chung-Chi Tsai , Chia-Wen Lin

Category: Computer Vision

2022-04-10

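Images taken in dynamic scenes may contain unwanted motion blur, which significantly degrades visual quality. Such blur causes short- and long-range region-specific smoothing artifacts that are often directional and non-uniform, and are therefore difficult to remove. Inspired by the recent success of transformers on computer vision and image processing tasks, we develop Stripformer, a transformer-based architecture that constructs intra- and inter-strip tokens to reweight image features in the horizontal and vertical directions, so as to catch blurred patterns with different orientations. It stacks interlaced intra-strip and inter-strip attention layers to reveal blur magnitudes. In addition to detecting region-specific blurred patterns of various orientations and magnitudes, Stripformer is a token-efficient and parameter-efficient transformer model: it demands much less memory and computation than the vanilla transformer, yet works better without relying on enormous amounts of training data. Experimental results show that Stripformer performs favorably against state-of-the-art models in dynamic scene deblurring.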
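
To make the token-efficiency claim concrete, the sketch below counts query-key pairs for one attention layer on a feature map. It is a back-of-the-envelope illustration, not the authors' implementation: the function name `attention_pair_counts` is hypothetical, and the way strips are formed (each row or column is a strip; intra-strip attention stays within one strip; inter-strip attention treats each strip as a single token) is an assumed reading of the abstract.

```python
def attention_pair_counts(height: int, width: int) -> dict:
    """Count query-key pairs for one attention layer on a height x width map.

    Assumption (not stated in the abstract): a strip is one full row or one
    full column of the feature map.
    """
    pixels = height * width
    return {
        # Vanilla self-attention: every pixel attends to every pixel.
        "vanilla": pixels * pixels,
        # Intra-strip: pixels attend only within their own row,
        # and separately only within their own column.
        "intra_strip": height * width * width + width * height * height,
        # Inter-strip: each row is one token and each column is one token.
        "inter_strip": height * height + width * width,
    }


counts = attention_pair_counts(64, 64)
for name, pairs in counts.items():
    print(f"{name:12s} {pairs:>12,d}")
```

Under these assumptions, the strip-based variants need far fewer pairwise interactions than pixel-level attention on the same map, which is consistent with the lower memory and computation cost the abstract reports.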

translated by Google Translate

BANet: Blur-aware Attention Networks for Dynamic Scene Deblurring

Fu-Jen Tsai , Yan-Tsung Peng , Yen-Yu Lin , Chung-Chi Tsai , Chia-Wen Lin

Category: Computer Vision

2021-01-19

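Image motion blur usually results from moving objects or camera shake. Such blur is generally directional and non-uniform. Previous work attempted to handle non-uniform blur with self-recurrent multi-scale or multi-patch architectures combined with self-attention. However, self-recurrent frameworks typically lead to longer inference time, while inter-pixel or inter-channel self-attention may cause excessive memory usage. This paper proposes the Blur-aware Attention Network (BANet), which accomplishes accurate and efficient deblurring in a single forward pass. BANet uses region-based self-attention with multi-kernel strip pooling to disentangle blur patterns of different magnitudes and orientations, and cascaded parallel dilated convolutions to aggregate multi-scale content features. Extensive experiments on the GoPro and HIDE benchmarks show that the proposed BANet performs favorably against state-of-the-art methods in blurred image restoration and can provide deblurred results in real time.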
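
As a rough illustration of what strip pooling does, the sketch below averages a 2D feature map along full rows and full columns. It is a simplified, single-kernel version written for this note: the helper name `strip_pool` is hypothetical, and the multi-kernel variant and its attention weighting described in the abstract are not reproduced here.

```python
def strip_pool(feature_map):
    """Average a 2D feature map along horizontal and vertical strips.

    Returns (row_means, col_means): one value per row strip and one value
    per column strip. Assumes a non-empty rectangular list of lists.
    """
    height = len(feature_map)
    row_means = [sum(row) / len(row) for row in feature_map]
    col_means = [sum(col) / height for col in zip(*feature_map)]
    return row_means, col_means


rows, cols = strip_pool([[1, 2, 3],
                         [4, 5, 6]])
print(rows)  # [2.0, 5.0]
print(cols)  # [2.5, 3.5, 4.5]
```

Pooling along long, thin strips summarizes content in one direction at a time, which is one plausible way to respond to blur that is directional.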

Adapting to Latent Subgroup Shifts via Concepts and Proxies

Ibrahim Alabdulmohsin , Nicole Chiou , Alexander D'Amour , Arthur Gretton , Sanmi Koyejo , Matt J. Kusner , Stephen R. Pfohl , Olawale Salaudeen , Jessica Schrouff , Katherine Tsai

Category: Machine Learning (Statistics) | Artificial Intelligence | Machine Learning

2022-12-21

We address the problem of unsupervised domain adaptation when the source domain differs from the target domain because of a shift in the distribution of a latent subgroup. When this subgroup confounds all observed data, neither covariate shift nor label shift assumptions apply. We show that the optimal target predictor can be non-parametrically identified with the help of concept and proxy variables available only in the source domain, and unlabeled data from the target. The identification results are constructive, immediately suggesting an algorithm for estimating the optimal predictor in the target. For continuous observations, when this algorithm becomes impractical, we propose a latent variable model specific to the data generation process at hand. We show how the approach degrades as the size of the shift changes, and verify that it outperforms both covariate and label shift adjustment.

Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection

Cheng-Ju Ho , Chen-Hsuan Tai , Yi-Hsuan Tsai , Yen-Yu Lin , Ming-Hsuan Yang

Category: Computer Vision

2022-12-19

Semi-supervised object detection is important for 3D scene understanding because obtaining large-scale 3D bounding box annotations on point clouds is time-consuming and labor-intensive. Existing semi-supervised methods usually employ teacher-student knowledge distillation together with an augmentation strategy to leverage unlabeled point clouds. However, these methods adopt global augmentation with scene-level transformations and hence are sub-optimal for instance-level object detection. In this work, we propose an object-level point augmentor (OPA) that performs local transformations for semi-supervised 3D object detection. In this way, the resultant augmentor is derived to emphasize object instances rather than irrelevant backgrounds, making the augmented data more useful for object detector training. Extensive experiments on the ScanNet and SUN RGB-D datasets show that the proposed OPA performs favorably against the state-of-the-art methods under various experimental settings. The source code will be available at https://github.com/nomiaro/OPA.

Associations Between Natural Language Processing (NLP) Enriched Social Determinants of Health and Suicide Death among US Veterans

Avijit Mitra , Richeek Pradhan , Rachel D Melamed , Kun Chen , David C Hoaglin , Katherine L Tucker , Joel I Reisman , Zhichao Yang , Weisong Liu , Jack Tsai

Category: Natural Language Processing

2022-12-11

Importance: Social determinants of health (SDOH) are known to be associated with increased risk of suicidal behaviors, but few studies utilized SDOH from unstructured electronic health record (EHR) notes. Objective: To investigate associations between suicide and recent SDOH, identified using structured and unstructured data. Design: Nested case-control study. Setting: EHR data from the US Veterans Health Administration (VHA). Participants: 6,122,785 Veterans who received care in the US VHA between October 1, 2010, and September 30, 2015. Exposures: Occurrence of SDOH over a maximum span of two years compared with no occurrence of SDOH. Main Outcomes and Measures: Cases of suicide deaths were matched with 4 controls on birth year, cohort entry date, sex, and duration of follow-up. We developed an NLP system to extract SDOH from unstructured notes. Structured data, NLP on unstructured data, and combining them yielded seven, eight and nine SDOH respectively. Adjusted odds ratios (aORs) and 95% confidence intervals (CIs) were estimated using conditional logistic regression. Results: In our cohort, 8,821 Veterans committed suicide during 23,725,382 person-years of follow-up (incidence rate 37.18 /100,000 person-years). Our cohort was mostly male (92.23%) and white (76.99%). Across the six common SDOH as covariates, NLP-extracted SDOH, on average, covered 84.38% of all SDOH occurrences. All SDOH, measured by structured data and NLP, were significantly associated with increased risk of suicide. The SDOH with the largest effects was legal problems (aOR=2.67, 95% CI=2.46-2.89), followed by violence (aOR=2.26, 95% CI=2.11-2.43). NLP-extracted and structured SDOH were also associated with suicide. Conclusions and Relevance: NLP-extracted SDOH were always significantly associated with increased risk of suicide among Veterans, suggesting the potential of NLP in public health studies.
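
The reported incidence rate can be checked directly from the counts given in the abstract. The sketch below is only an arithmetic check; the function name `incidence_rate` is a hypothetical helper, not part of the study's code.

```python
def incidence_rate(events: int, person_years: float, per: int = 100_000) -> float:
    """Return the number of events per `per` person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events / person_years * per


# Counts taken from the abstract: 8,821 suicide deaths over
# 23,725,382 person-years of follow-up.
rate = incidence_rate(8_821, 23_725_382)
print(f"{rate:.2f} per 100,000 person-years")  # 37.18 per 100,000 person-years
```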

Object Goal Navigation with End-to-End Self-Supervision

So Yeon Min , Yao-Hung Hubert Tsai , Wei Ding , Ali Farhadi , Ruslan Salakhutdinov , Yonatan Bisk , Jian Zhang

Category: Robotics | Machine Learning

2022-12-09

A household robot should be able to navigate to target locations without requiring users to first annotate everything in their home. Current approaches to this object navigation challenge do not test on real robots and rely on expensive semantically labeled 3D meshes. In this work, our aim is an agent that builds self-supervised models of the world via exploration, much as a child might. We propose an end-to-end self-supervised embodied agent that leverages exploration to train a semantic segmentation model of 3D objects, and uses those representations to learn an object navigation policy purely from self-labeled 3D meshes. The key insight is that embodied agents can leverage location consistency as a supervision signal: they collect images from different views and angles and apply contrastive learning to fine-tune a semantic segmentation model. In our experiments, we observe that our framework performs better than other self-supervised baselines and competitively with supervised baselines, both in simulation and when deployed in real houses.

Automated Identification of Eviction Status from Electronic Health Record Notes

Zonghai Yao , Jack Tsai , Weisong Liu , David A. Levy , Emily Druhl , Joel I Reisman , Hong Yu

Category: Natural Language Processing

2022-12-06

Objective: Evictions are involved in a cascade of negative events that can lead to unemployment, homelessness, long-term poverty, and mental health problems. In this study, we developed a natural language processing system to automatically detect eviction incidences and their attributes from electronic health record (EHR) notes. Materials and Methods: We annotated eviction status in 5000 EHR notes from the Veterans Health Administration. We developed a novel model, called Knowledge Injection based on Ripple Effects of Social and Behavioral Determinants of Health (KIRESH), that has been shown to substantially outperform other state-of-the-art models such as fine-tuned pre-trained language models like BioBERT and Bio_ClinicalBERT. Moreover, we designed a prompt to further improve the model performance by using the intrinsic connection between the two sub-tasks of eviction presence and period prediction. Finally, we used Temperature Scaling-based Calibration on our KIRESH-Prompt method to avoid over-confidence issues arising from the imbalanced dataset. Results: KIRESH-Prompt achieved a Macro-F1 of 0.6273 (presence) and 0.7115 (period), which was significantly higher than 0.5382 (presence) and 0.67167 (period) for just fine-tuning the Bio_ClinicalBERT model. Conclusion and Future Work: KIRESH-Prompt has substantially improved eviction status classification. In future work, we will evaluate the generalizability of the model framework to other applications.
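
Temperature scaling, the calibration step named in the abstract, divides a classifier's logits by a scalar before the softmax. The sketch below shows the general technique only; the function name is hypothetical, the logits and temperature are made-up illustration values, and how KIRESH-Prompt fits its temperature is not described in the abstract.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a positive temperature.

    temperature > 1 flattens the distribution (less confident);
    temperature = 1 gives the ordinary softmax.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0, 0.5]  # illustrative values, not from the paper
raw = softmax_with_temperature(logits)
calibrated = softmax_with_temperature(logits, temperature=2.0)
print(max(raw), max(calibrated))
```

In common practice the temperature is a single scalar fitted on a held-out validation set; because dividing by a positive constant preserves the order of the logits, the predicted class does not change, only the confidence attached to it.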

Are AlphaZero-like Agents Robust to Adversarial Perturbations?

Li-Cheng Lan , Huan Zhang , Ti-Rong Wu , Meng-Yu Tsai , I-Chen Wu , Cho-Jui Hsieh

Category: Artificial Intelligence | Machine Learning | Robotics

2022-11-07

The success of AlphaZero (AZ) has demonstrated that neural-network-based Go AIs can surpass human performance by a large margin. Given that the state space of Go is extremely large and a human player can play the game from any legal state, we ask whether adversarial states exist for Go AIs that may lead them to play surprisingly wrong actions. In this paper, we first extend the concept of adversarial examples to the game of Go: we generate perturbed states that are "semantically" equivalent to the original state by adding meaningless moves to the game, and an adversarial state is a perturbed state leading to an undoubtedly inferior action that is obvious even for Go beginners. However, searching for adversarial states is challenging due to the large, discrete, and non-differentiable search space. To tackle this challenge, we develop the first adversarial attack on Go AIs that can efficiently search for adversarial states by strategically reducing the search space. This method can also be extended to other board games such as NoGo. Experimentally, we show that the actions taken by both Policy-Value neural network (PV-NN) and Monte Carlo tree search (MCTS) can be misled by adding one or two meaningless stones; for example, on 58% of the AlphaGo Zero self-play games, our method can make the widely used KataGo agent with 50 simulations of MCTS play a losing action by adding two meaningless stones. We additionally evaluated the adversarial examples found by our algorithm with amateur human Go players, and 90% of examples indeed lead the Go agent to play an obviously inferior action. Our code is available at https://PaperCode.cc/GoAttack.

Towards Multimodal Multitask Scene Understanding Models for Indoor Mobile Agents

Yao-Hung Hubert Tsai , Hanlin Goh , Ali Farhadi , Jian Zhang

Category: Computer Vision | Artificial Intelligence

2022-09-27

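The perception system in personalized mobile agents requires indoor scene understanding models that can understand 3D geometry, capture objectiveness, analyze human behavior, and so on. Nonetheless, this direction has not been well explored in comparison with models for outdoor environments (e.g., autonomous driving systems, which include pedestrian prediction, car detection, traffic sign recognition, etc.). In this paper, we first discuss the main challenge: insufficient, or even no, labeled data for real-world indoor environments. Other challenges include fusion between heterogeneous sources of information (e.g., RGB images and LiDAR point clouds), modeling relationships among a diverse set of outputs (e.g., 3D object locations, depth estimation, and human poses), and computational efficiency. Then, we describe MMISM (Multi-modality input Multi-task output Indoor Scene understanding Model) to tackle the above challenges. MMISM takes RGB images and sparse LiDAR points as inputs, and performs 3D object detection, depth completion, human pose estimation, and semantic segmentation as output tasks. We show that MMISM performs on par with, or even better than, single-task models. For example, we improve the baseline 3D object detection results by 11.7% on the benchmark ARKitScenes dataset.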

Ki-67 Index Measurement in Breast Cancer Using Digital Image Analysis

Hsiang-Wei Huang , Wen-Tsung Huang , Hsun-Heng Tsai

Category: Computer Vision

2022-09-27

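Ki-67 is a nuclear protein that is produced during cell proliferation. The Ki-67 index is a valuable prognostic variable in several kinds of cancer. In breast cancer, the index is routinely checked in many patients. Currently, pathologists use immunohistochemistry to calculate the percentage of Ki-67-positive malignant cells as the Ki-67 index. A higher score usually means more aggressive tumor behavior. In clinical practice, measuring the Ki-67 index relies on visual identification and manual counting. However, visual and manual assessment is time-consuming and poorly reproducible, because scoring standards differ and the tumor area under assessment is limited. Here, we use digital image processing techniques, including image binarization and morphological operations, to create a digital image analysis method for interpreting the Ki-67 index. Ten breast cancer specimens are then used for validation, with high accuracy (correlation coefficient r = 0.95127). With the assistance of digital image analysis, pathologists can interpret the Ki-67 index more efficiently and precisely, with excellent reproducibility.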
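
The counting step behind such an index can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the authors' pipeline: the helper names (`binarize`, `count_blobs`, `ki67_index`), the threshold, and the tiny intensity grids are all made up, each 4-connected blob is assumed to be one cell, and the morphological clean-up mentioned in the abstract is omitted.

```python
def binarize(image, threshold):
    """Turn a grayscale grid into a 0/1 mask (1 where intensity >= threshold)."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def count_blobs(mask):
    """Count 4-connected foreground components with an iterative flood fill."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    blobs = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] and not seen[r][c]:
                blobs += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return blobs


def ki67_index(positive_cells, total_cells):
    """Percentage of Ki-67-positive cells among all counted tumor cells."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return 100.0 * positive_cells / total_cells


# Hypothetical stain-intensity grids: one for Ki-67-positive nuclei,
# one for all nuclei in the same field.
positive_stain = [[9, 9, 0, 0, 0],
                  [9, 0, 0, 0, 0],
                  [0, 0, 0, 8, 8],
                  [0, 0, 0, 0, 0]]
all_nuclei = [[9, 9, 0, 7, 0],
              [9, 0, 0, 0, 0],
              [0, 0, 0, 8, 8],
              [6, 0, 0, 0, 0]]

positive = count_blobs(binarize(positive_stain, threshold=5))
total = count_blobs(binarize(all_nuclei, threshold=5))
print(ki67_index(positive, total))  # 50.0
```

On real slides, touching nuclei and staining noise make a raw blob count unreliable, which is where morphological operations such as opening and closing usually come in.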
